Conversation
**bob-carpenter** left a comment:
Thanks for looking this stuff over! The code changes are good, but I'm reluctant to change the original authors' language on mastery vs. knowledge given the definitions.
```diff
 The *deterministic inputs, noisy "and" gate* (DINA) model [@Junker2001] is a popular conjunctive CDM, which assumes that a respondent must have mastered all required attributes in order to correctly respond to an item on an assessment.

-To estimate respondents' knowledge of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix, which is an $I \times K$ matrix where $q_{ik} = 1$ if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.
```
---
Seung Yeon Lee's original text says:
> The presence and absence of attributes are referred to as "mastery" and "non-mastery" respectively. A respondent's knowledge is represented by a binary vector, referred to as "attribute profile", to indicate which attributes have been mastered or have not.
So I think either "knowledge" or "mastery" would work here (knowledge being the set of attributes mastered), so I'd be reluctant to change this.
---
Right, so attributes are 'mastered' (not 'known'). The high-level concept of knowledge is characterized by (low-level) attribute mastery; I'm going by the author's terminology. It's not a big deal, but using another term to mean the same thing, or using metonymy (as is the case here), can create unnecessary extra cognitive load, if not confusion, and you don't want that in materials that are already difficult to digest. But if you insist, I will, of course, revert this change.
---
I also find this terminology confusing. It's doubly confusing because it's trying to overload two commonly used nouns with technical meanings. The original author of the DINA model introduced the terminology, it looks like. We're often walking the line between our general preferences for naming, etc., and that of a subfield we work with. Andrew's particularly prone to try to change entrenched terminology.
```diff
+To estimate respondents' mastery of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix, which is an $I \times K$ matrix where $q_{ik} = 1$ if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.

-A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s knowledge of attribute $k$, where $\alpha_{jk} = 1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
+A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s mastery of attribute $k$, where $\alpha_{jk} = 1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
```
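To make the conjunctive ("and" gate) rule in the quoted text concrete, here is a small sketch (not taken from the case study's R code; the function name and data are illustrative) of the DINA ideal response $\eta_{ji} = \prod_k \alpha_{jk}^{q_{ik}}$, which equals 1 exactly when respondent $j$ has mastered every attribute that item $i$ requires:

```python
import numpy as np

def ideal_response(alpha, Q):
    """Deterministic part of the DINA model.

    alpha: J x K binary matrix of attribute profiles (alpha[j, k] = 1
           if respondent j has mastered attribute k).
    Q:     I x K binary Q-matrix (Q[i, k] = 1 if item i requires
           attribute k).
    Returns a J x I binary matrix eta with eta[j, i] = 1 iff
    respondent j has mastered all attributes required by item i.
    """
    # (alpha @ Q.T)[j, i] counts required attributes of item i that
    # respondent j has mastered; compare with item i's requirement count.
    return (alpha @ Q.T >= Q.sum(axis=1)).astype(int)

# Two items over K = 2 attributes: item 0 requires attribute 0 only,
# item 1 requires both attributes.
Q = np.array([[1, 0],
              [1, 1]])
# Three respondents: mastered nothing, attribute 0 only, both.
alpha = np.array([[0, 0],
                  [1, 0],
                  [1, 1]])
eta = ideal_response(alpha, Q)
# eta -> [[0, 0], [1, 0], [1, 1]]
```

In the full model, slip and guess parameters then perturb $\eta_{ji}$ into response probabilities.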
---
[doesn't need to be fixed for this PR]

Lines should be at most 80 characters where possible; otherwise edits produce a mess in Git diffs when we're looking at 300-character lines.
---
I completely agree. The command-line diff is also showing me that there is a trailing whitespace, so I'll fix both formatting issues at the same time.
```diff
 # References

-<!-- This comment causes section to be numbered -->
\ No newline at end of file
+<!-- This comment causes section to be numbered -->
```
---
[not required for this PR]
Are these lines different? You can also just specify whether sections are numbered in the top-level yaml.
---
Hi @bob-carpenter ! Thanks for reviewing.
I didn't mean to submit this change specifically; it's the text editor (in this case, GitHub's editor, but Vim would do the same thing) which adds a "newline at end of file" whenever it's missing.
Indeed, that's what the command-line diff looks like:

```diff
-<!-- This comment causes section to be numbered -->
\ No newline at end of file
+<!-- This comment causes section to be numbered -->
```
I would think it's a healthy change although very minor.
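If it helps to verify, here is a minimal Python sketch of the check an editor effectively performs (the file name `example.Rmd` is hypothetical): a file "ends with a newline" exactly when its last byte is `\n`.

```python
from pathlib import Path

# Create an example file WITHOUT a trailing newline.
path = Path("example.Rmd")
path.write_text("<!-- This comment causes section to be numbered -->")

# Git's "\ No newline at end of file" marker corresponds to this test.
if not path.read_bytes().endswith(b"\n"):
    print(f"{path}: missing trailing newline")
```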
---
Ah, missed that---thanks!
```diff
 $$

-Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $s$ be denoted $\boldsymbol{\theta}^{(s)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
+Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $t$ be denoted $\boldsymbol{\theta}^{(t)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
```
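Written out, the estimator this passage describes is the average over posterior draws (a sketch consistent with the quoted text; $T$, the number of saved MCMC draws, is notation introduced here):

```latex
\widehat{\mathrm{Pr}}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \mid \boldsymbol{Y}_j=\boldsymbol{y}_j)
  = \frac{1}{T} \sum_{t=1}^{T}
    \mathrm{Pr}\!\left(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \,\middle|\,
      \boldsymbol{Y}_j=\boldsymbol{y}_j,\ \boldsymbol{\theta}^{(t)}\right)
```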
---
@bob-carpenter Here, the change is only about using $t$ instead of $s$ for the MCMC iteration index, since $s_i$ already denotes the slip parameter.

---
Hello @education-stan !
I've been studying this example as I'm planning to draw on it and cite it at an upcoming seminar I'll be giving (maybe I should post about it on Discourse, in the spirit of this recent comment). I have emailed the author, @sylee30, with a question that I may have figured out in the meantime...

Anyway, I'm suggesting the following edits after running the R code and considering that 'knowledge' is used in an abstract way (knowledge is represented by attribute/skill mastery).
Best,
Marianne